291 research outputs found
Recommended from our members
An Adaptive Neuro-Fuzzy System with Semi-Supervised Learning as an Approach to Improving Data Classification: An Illustration of Bad Debt Recovery in Healthcare
Business analytics has become an increasingly important priority for organizations today as they strive to achieve greater competitiveness. As organizations adopt business practices that rely on complex, large-scale data, new challenges also emerge. A common situation in business analytics is concerned with appropriate and adequate methods for dealing with unlabeled data in classification. This study examines the effectiveness of a semi-supervised learning approach to classify unlabeled data to improve classification accuracy rates. The context for our study is healthcare. The healthcare costs in the U.S. have risen at an alarming rate over the last two decades. One of the causes for the rising costs could be attributed to medical bad debt, i.e., debt that is not recovered by healthcare institutions. A major obstacle to debt classification, hence better debt recovery, is the presence of unlabeled cases, a situation not uncommon in many other business contexts. There is surprisingly very little research that explores the performance of computational intelligence and soft computing methods in improving bad debt recovery in the healthcare industry. Using a real data set from a healthcare organization, we address this important research gap by examining the performance of an adaptive neuro-fuzzy inference system (ANFIS) with semi-supervised learning (SSL) in improving debt recovery rate. In particular, this study explores the role of ANFIS in conjunction with SSL in classifying unknown cases (those that were not pursued for debt collection) as either a good case (recoverable) or a bad case (unrecoverable). Healthcare institutions can then pursue these potentially good cases and improve their debt recovery rates. Test results show that ANFIS with SSL is a viable method. Our models generated better classification accuracy rates than those in prior studies. 
These results and their analysis show the potential of ANFIS with SSL models in classifying unknown cases, which are a potential source of revenue recovery for healthcare organizations. The significance of this research extends to all types of organizations that face an increasingly urgent need to adopt reliable practices for business analytics.
Improving Prediction Models for Mass Assessment: A Data Stream Approach
Mass appraisal is the process of valuing a large collection of properties within a city or municipality, usually for tax purposes. The common methodology for mass appraisal is based on multiple regression, though this methodology has been found to be deficient. Data mining methods have been proposed and tested as an alternative, but the results are very mixed. This study introduces a new approach to building prediction models for assessing residential property values by treating past sales transactions as a data stream. The study used 110,525 sales transaction records from a municipality in the Midwest of the US. Our results show that a data-stream-based approach outperforms the traditional regression approach, thus showing its potential for improving the performance of prediction models for mass assessment.
An Innovative Approach to Modeling Aviation Safety Incidents
Due to the complexity of aviation safety operations, the number of flight incidents continues to rise. The Aviation Safety Reporting System (ASRS) contains the largest collection of such incidents. Efficient and effective analysis of these incidents remains a challenge. This paper proposes a new approach to analyzing aviation safety records using deep learning methods to improve incident classification. The proposed approach, CNN-LSTM, combines the characteristics of a convolutional neural network (CNN) and a long short-term memory (LSTM) neural network with a distributed computing method to model aviation safety data. Five machine learning methods (Logistic Regression, Naive Bayes, Random Forest, Support Vector Machine, and Multi-layer Perceptron) were used as baselines for comparison with CNN-LSTM. The results show that the CNN-LSTM model can significantly improve the classification accuracy rates for aviation safety incident reports using Word2Vec. The distributed platform in Spark with clusters can make full use of computing resources when processing textual data from ASRS, greatly reducing processing time compared with machine learning algorithms running on a standalone computer. Timely and accurate identification of the causes of reported incidents is important. The results of this study demonstrate a new approach to improving both accuracy and efficiency in incident cause identification.
Data Stream Models for Predicting Adverse Events in a War Theater
Predicting adverse events in a war theater has been an active area of research. Recent studies used machine learning methods to predict adverse events, utilizing infrastructure development spending data as input variables. The goals of these studies were to find correlations between adverse events and human-social-infrastructure development projects, to disclose the main factors involved, and to reduce the occurrence of adverse events. With the existing methods, the predictions still have large errors compared with the real values. The reason could be that some significant variables are removed to comply with the constraints of soft computing models such as neural networks, fuzzy inference systems (FIS) and adaptive neuro-fuzzy inference systems (ANFIS), which work well with a smaller number of variables. In this paper, a data stream approach using three data stream regression algorithms, AMRules, TargetMean and FIMTDD, is proposed to predict adverse events so that many more input variables can be included. The results show that the data stream methods generate better results than the machine learning methods used in the previous studies, thus helping us better understand the relationship between infrastructure development and adverse events. In addition, the data stream methods also outperform the traditional linear regression model. An important advantage of data stream methods is the ability to create and apply predictive models with a relatively small amount of memory and time. Finally, data stream methods provide an additional advantage by allowing the user to observe the error distribution over time for a more accurate assessment of the performance of the resulting models.
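The following is a minimal sketch of the general idea behind data stream regression with error tracked over time, not the implementation used in the study. It assumes a running-mean predictor as a stand-in baseline (the class `RunningMeanRegressor` and the function `prequential_mae` are hypothetical names introduced here for illustration) and a test-then-train loop in which each record is first used for prediction and only then for learning, so memory use stays constant and the error can be observed as the stream progresses.

```python
class RunningMeanRegressor:
    """Hypothetical baseline: predicts the mean of all targets seen so far."""

    def __init__(self):
        self.n = 0
        self.mean = 0.0

    def predict(self, x):
        # Features are ignored; the prediction is the current running mean.
        return self.mean

    def learn(self, x, y):
        # Incremental mean update: constant memory, one pass over the data.
        self.n += 1
        self.mean += (y - self.mean) / self.n


def prequential_mae(stream, model):
    """Test-then-train loop; returns the running mean absolute error
    after each instance so the error can be inspected over time."""
    total_error = 0.0
    history = []
    for i, (x, y) in enumerate(stream, start=1):
        total_error += abs(y - model.predict(x))  # test first
        model.learn(x, y)                         # then train
        history.append(total_error / i)
    return history


# Toy stream of (features, target) pairs; the values are illustrative only.
stream = [((0,), 2.0), ((1,), 4.0), ((2,), 6.0)]
history = prequential_mae(stream, RunningMeanRegressor())
```

Any incremental learner exposing `predict` and `learn` could be swapped in for the baseline; the loop itself does not depend on the model.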
An Innovative Clustering Approach to Market Segmentation for Improved Price Prediction
A main obstacle to accurate prediction is often the heterogeneous nature of data. Existing studies have pointed to data clustering as a potential solution to reduce heterogeneity, and therefore increase prediction accuracy. This paper describes an innovative clustering approach based on a novel adaptation of the Fuzzy C-Means algorithm and its application to market segmentation in real estate. Over 15,000 actual home sales transactions were used to evaluate our approach. The test results demonstrate that the accuracy of price prediction shows notable improvement for some clustered market segments. In comparison with existing methods, our approach is simple to implement. It does not require additional collection of data or costly development of models to incorporate socioeconomic factors into segmentation. Finally, our approach is not market specific and can be easily applied across different housing markets.
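For readers unfamiliar with Fuzzy C-Means, the sketch below shows the textbook algorithm on one-dimensional data. It is not the novel adaptation described in the abstract, whose details are not given here. The function name `fuzzy_c_means`, the fuzzifier value `m=2.0`, the fixed iteration count, and the toy data are all assumptions made for illustration.

```python
def fuzzy_c_means(points, centers, m=2.0, iterations=50):
    """Textbook Fuzzy C-Means on 1-D data.

    points:  list of floats
    centers: initial cluster centers (list of floats)
    m:       fuzzifier (> 1); larger values give softer memberships
    Returns (centers, memberships), where memberships[i][j] is the degree
    to which points[i] belongs to cluster j.
    """
    centers = list(centers)
    power = 2.0 / (m - 1.0)
    memberships = []
    for _ in range(iterations):
        # Step 1: update memberships from the current centers.
        memberships = []
        for x in points:
            dists = [abs(x - c) for c in centers]
            if 0.0 in dists:
                # The point sits exactly on a center: full membership there.
                hit = dists.index(0.0)
                row = [1.0 if j == hit else 0.0 for j in range(len(centers))]
            else:
                row = [
                    1.0 / sum((d_j / d_k) ** power for d_k in dists)
                    for d_j in dists
                ]
            memberships.append(row)
        # Step 2: update each center as a membership-weighted mean.
        centers = [
            sum((u[j] ** m) * x for u, x in zip(memberships, points))
            / sum(u[j] ** m for u in memberships)
            for j in range(len(centers))
        ]
    return centers, memberships


# Two well-separated toy groups (illustrative values, not real sales data).
points = [1.0, 1.2, 0.8, 10.0, 10.2, 9.8]
centers, memberships = fuzzy_c_means(points, centers=[0.0, 5.0])
```

Unlike hard clustering, each record keeps a graded membership in every segment, which is what allows a record near a segment boundary to inform more than one segment.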
Identification of Human Factors in Aviation Incidents Using a Data Stream Approach
This paper investigates the use of data streaming analytics to better predict the presence of human factors in aviation incidents as new incident reports arrive. As new incident data become available, the fresh information can help not only evaluate but also improve existing models. First, we use four algorithms in batch learning to establish a baseline for comparison purposes. These are NaiveBayes (NB), Cost Sensitive Classifier (CSC), Hoeffdingtree (VFDT), and OzabagADWIN (OBA). The traditional measure of classification accuracy rate is used to test their performance. The results show that among the four, NB and CSC are the best classification algorithms. Then we test the classifiers in a data stream setting. Two performance evaluation methods, Holdout and Interleaved Test-Then-Train (Prequential), are used in this setting. The Kappa statistic charts of the Prequential measure with a sliding window show that NB exhibits the best performance among the four algorithms. The two evaluation methods, batch learning with 10-fold cross validation and data stream with the Prequential measure, yield a consistent result. CSC is suitable for unbalanced data in batch learning, but it is not the best in terms of the Kappa statistic for data streams. Valid incremental algorithms need to be developed for data streams with unbalanced labels.
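As an illustration of Interleaved Test-Then-Train evaluation with a sliding-window Kappa statistic, here is a minimal sketch under stated assumptions. The `MajorityClass` learner is a hypothetical placeholder, not one of the four algorithms in the study, and the window size and toy labels are arbitrary. Kappa is computed as Cohen's kappa, (p0 - pe) / (1 - pe), over the most recent predictions.

```python
from collections import Counter, deque


class MajorityClass:
    """Hypothetical placeholder learner: predicts the most frequent label so far."""

    def __init__(self):
        self.counts = Counter()

    def predict(self, x):
        return self.counts.most_common(1)[0][0] if self.counts else None

    def learn(self, x, y):
        self.counts[y] += 1


def cohen_kappa(pairs):
    """Cohen's kappa for a non-empty collection of (predicted, actual) pairs."""
    n = len(pairs)
    p0 = sum(1 for pred, true in pairs if pred == true) / n   # observed agreement
    pred_counts = Counter(pred for pred, _ in pairs)
    true_counts = Counter(true for _, true in pairs)
    # Agreement expected by chance, from the marginal label frequencies.
    pe = sum(pred_counts[c] * true_counts[c] for c in true_counts) / (n * n)
    return 0.0 if pe == 1.0 else (p0 - pe) / (1.0 - pe)


def prequential_kappa(stream, model, window=100):
    """Each instance is used to test first, then to train.
    Kappa is reported over the last `window` predictions."""
    recent = deque(maxlen=window)
    kappas = []
    for x, y in stream:
        recent.append((model.predict(x), y))  # test
        model.learn(x, y)                     # then train
        kappas.append(cohen_kappa(recent))
    return kappas


# Toy unbalanced stream: "h" = human factor present, "n" = absent.
stream = [((), "h"), ((), "h"), ((), "n")]
kappas = prequential_kappa(stream, MajorityClass(), window=100)
```

A learner that always predicts the majority label can reach high accuracy on unbalanced data while its Kappa stays near zero or below, which is one reason Kappa is a useful companion to accuracy in this setting.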
Deep Learning in Predicting Real Estate Property Prices: A Comparative Study
The dominant methods for real estate property price prediction or valuation are based on multiple regression. Regression-based methods are, however, imperfect because they suffer from issues such as multicollinearity and heteroscedasticity. Recent years have witnessed the use of machine learning methods, but the results are mixed. This paper applies deep learning models to real estate property price prediction, with the aim of improving prediction accuracy on data representing sales transactions in a large metropolitan area. Three deep learning models, LSTM, GRU and Transformer, are created and compared with other machine learning and traditional models. The results obtained for the data set with all features clearly show that the RF and Transformer models outperformed the other models. The LSTM and GRU models produced the worst results, suggesting that they are perhaps not suitable for predicting real estate prices. Furthermore, the implementations of Transformer and RF on a data set with feature reduction produced even more accurate prediction results. In conclusion, our research shows that the performance of the Transformer model is close to that of the RF model. Both models produce significantly better prediction results than existing approaches in terms of accuracy.
New Theoretical Results Concerning the Interstellar Abundance of Molecular Oxygen
The low abundance of molecular oxygen in cold cores of interstellar clouds poses a continuing problem to modelers of the chemistry of these regions. In chemical models O_2 is formed principally by the reaction between O and OH, which has been studied experimentally down to 39 K. It remains possible that the rate coefficient of this reaction at 10 K is considerably less than its measured value at 39 K, which might inhibit the production of O_2 and possibly bring theory and observation closer together over a wider range of times. Two theoretical determinations of the rate coefficient for the O + OH reaction at temperatures down to 10 K have been undertaken recently; both results show that the rate coefficient is indeed lower at 10 K than at 39 K, although they differ in the magnitude of the decrease. Here we show, using gas-phase models, how the calculated interstellar O_2 abundance in cold cores is affected by a decrease in the rate coefficient. We also consider its effect on other species. Our major finding is that for standard O-rich abundances, the calculated abundance of O_2 in cold cores is sufficiently low to explain observations only at early times regardless of the value of k_1 in the range investigated here. For C-rich abundances, on the other hand, late-time solutions can also be possible
Inertia matching of CNC cycloidal gear form grinding machine servo system
A reasonable ratio between the load inertia and the servo motor inertia plays a decisive role in the dynamic performance and stability of the servo system, as well as in the machining accuracy of the whole CNC machine. In order to improve the control performance and contour machining accuracy of the servo system of the CNC cycloidal gear form grinding machine, an optimization design method for the inertia matching of this servo system is proposed. A two-mass servo driving closed-loop PID control system is constructed, the influence of different inertia ratios on the dynamic performance and contour errors of the servo system is analyzed in depth, and the inertia ratio is optimized to satisfy the servo system performance requirements. Finally, the feasibility and practicability of the inertia matching optimization design method are verified through inertia ratio optimization grinding experiments on the cycloid gear in the CNC gear form grinding machine. This inertia matching optimization design method provides a valuable reference for the further design of CNC machine servo systems.